iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC

Bioinformatics. 2018 Jan 1;34(1):33-40. doi: 10.1093/bioinformatics/btx579.

Abstract

Motivation: A promoter is a short region of DNA responsible for initiating transcription of a particular gene in the genome. Promoters come in various types with different functions. Owing to their importance in biological processes, computational tools that identify promoters and their types in a timely manner are highly desirable. The need has become particularly urgent given the avalanche of DNA sequences discovered in the postgenomic age. Although some prediction methods have been developed, they can only discriminate a specific type of promoter from non-promoters; none of them can identify the type of a promoter. This is because different types of promoters may share quite similar consensus sequence patterns, while promoters of the same type may have considerably different consensus sequences.

Results: To overcome this difficulty, we used the multi-window-based PseKNC (pseudo K-tuple nucleotide composition) approach to incorporate short-, middle-, and long-range sequence information, and developed a seamless two-layer predictor named 'iPromoter-2L'. The first layer identifies a query DNA sequence as a promoter or non-promoter, and the second layer predicts which of the following six types an identified promoter belongs to: σ24, σ28, σ32, σ38, σ54 and σ70.
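The two ideas in the Results paragraph, multi-window features and a two-layer decision, can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses plain k-tuple composition as a simplified stand-in for PseKNC, the window sizes (9, 27, 81) and k = 2 are illustrative assumptions, and `is_promoter` and `promoter_type` are hypothetical placeholders for whatever trained classifiers fill each layer.

```python
from itertools import product

ALPHABET = "ACGT"
SIGMA_TYPES = ["σ24", "σ28", "σ32", "σ38", "σ54", "σ70"]


def kmer_composition(seq, k):
    """Normalized frequency of every k-tuple in seq.

    Plain k-tuple composition only; full PseKNC adds further terms
    that this sketch omits.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip k-tuples with ambiguous bases such as N
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


def multi_window_features(seq, window_sizes, k=2):
    """Concatenate k-tuple compositions of non-overlapping windows.

    Small windows capture short-range patterns; larger windows capture
    middle- and long-range ones. Window sizes are assumptions.
    """
    features = []
    for w in window_sizes:
        for start in range(0, len(seq) - w + 1, w):
            features.extend(kmer_composition(seq[start:start + w], k))
    return features


def two_layer_predict(seq, is_promoter, promoter_type, window_sizes=(9, 27, 81)):
    """Layer 1 decides promoter vs non-promoter; layer 2 assigns a sigma type.

    is_promoter and promoter_type are hypothetical callables taking a
    feature vector; any trained classifier could be wrapped this way.
    """
    feats = multi_window_features(seq, window_sizes)
    if not is_promoter(feats):
        return "non-promoter"
    return promoter_type(feats)
```

Passing the two layers in as callables keeps the feature extraction independent of the classifier choice, so the second layer runs only on sequences the first layer accepts.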

Availability and implementation: For the convenience of experimental scientists, a user-friendly, publicly accessible web-server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/iPromoter-2L/. It is anticipated that iPromoter-2L will become a very useful high-throughput tool for genome analysis.

Contact: bliu@hit.edu.cn or dshuang@tongji.edu.cn or kcchou@gordonlifescience.org.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Bacterial / metabolism
  • DNA-Directed RNA Polymerases / metabolism
  • Escherichia coli / enzymology
  • Escherichia coli / genetics*
  • Escherichia coli Proteins / metabolism
  • Genome, Bacterial
  • Genomics / methods*
  • Promoter Regions, Genetic*
  • Sequence Analysis, DNA / methods*
  • Software*

Substances

  • DNA, Bacterial
  • Escherichia coli Proteins
  • DNA-Directed RNA Polymerases